DARCI: Distributed Association Rule Mining Utilizing Closed Itemsets
Abstract
A distributed rule mining algorithm must minimize communication cost, both to reduce bandwidth use and to improve scalability. A few distributed rule mining algorithms have been reported in the literature. In this paper we propose a new distributed association rule mining algorithm, called DARCI, which reduces the communication cost by up to 40% relative to the best known algorithm. The first step in mining global rules is to find the globally frequent itemsets, which involves exchanging the local supports of potentially frequent itemsets. DARCI reduces the communication cost by (1) sending frequent closed itemsets, which are supersets of locally frequent itemsets, instead of sending individual itemsets, and (2) using a novel pruning technique to reduce the probability of having to send local supports.
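As a rough illustration of why closed itemsets are a compact thing to send, the Python sketch below enumerates the frequent itemsets of one small local database by brute force, keeps only the closed ones (those with no proper superset of equal support), and recovers the support of every frequent itemset from the closed set alone. It is not DARCI's protocol or its pruning technique, neither of which is described here. The toy transactions, the threshold `min_count`, and all function names are assumptions made for the example.

```python
from itertools import combinations

def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_count):
    """Brute-force enumeration of all itemsets with support >= min_count."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = support(frozenset(combo), transactions)
            if s >= min_count:
                result[frozenset(combo)] = s
    return result

def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        x: s for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }

def support_from_closed(itemset, closed):
    """Support of a frequent itemset = max support over its closed supersets."""
    return max((s for c, s in closed.items() if itemset <= c), default=None)

# Hypothetical local database at one site (toy data, not from the paper).
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"},
)]
frequent = frequent_itemsets(transactions, min_count=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(closed), "closed itemsets")
for x, s in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    # Every frequent itemset's support is recoverable from the closed set.
    print(sorted(x), s, support_from_closed(x, closed))
```

In this toy database the three closed itemsets carry the same support information as all seven frequent itemsets, which is the kind of saving that sending closed itemsets aims for. The brute-force enumeration is exponential in the number of items and is only meant for small examples.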
Similar resources
Mining Frequent Closed Itemsets from Highly Distributed Repositories
In this paper we address the problem of mining frequent closed itemsets in a highly distributed setting like a Grid. The extraction of frequent (closed) itemsets is an important problem in Data Mining, and is a very expensive phase needed to extract from a transactional database a reduced set of meaningful association rules, typically used for Market Basket Analysis. We figure out an environmen...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Using attribute value lattice to find closed frequent itemsets
Finding all closed frequent itemsets is a key step of association rule mining, since the non-redundant association rules can be inferred from the closed frequent itemsets. In this paper we present a new method for finding closed frequent itemsets based on an attribute value lattice. In the new method, we argue that vertical data representation and attribute value lattice can find all closed freq...
Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions
The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...
Mining Closed Itemsets: A Review
Closed itemset mining is a popular research topic in data mining. It was proposed to avoid the large number of redundant itemsets produced in frequent itemset mining. Various algorithms have been proposed with efficient strategies to generate closed itemsets. This paper aims to study the existing algorithms used to mine closed itemsets. The various strategies in the algorithms are presented and analyzed in this paper.